Improve chanmon_consistency fuzz target performance #4509
joostjager wants to merge 3 commits into lightningdevkit:main
The searched-for log message ("Outbound update_fee HTLC buffer
overflow") no longer exists in the lightning crate, so the
from_utf8 + contains check on every log line was pure waste.
AI tools were used in preparing this commit.
Even though DevNull discards the bytes, the formatting work (SubstringFormatter, fmt::write, from_utf8) was still being done on every log call. Short-circuit in TestLogger::log via a TypeId check, which monomorphization resolves at compile time. AI tools were used in preparing this commit.
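The short-circuit described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `SketchLogger`, its fields, and this `DevNull` are hypothetical stand-ins for the fuzz crate's types. Because `Out` is a concrete type in each monomorphized copy of `log`, the `TypeId` comparison is between two constants and the optimizer can typically remove the formatting path entirely for the `DevNull` instantiation.

```rust
use std::any::TypeId;
use std::fmt;
use std::io::{self, Write};

/// Hypothetical stand-in for a sink that discards everything written to it.
pub struct DevNull;

impl Write for DevNull {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical logger that is generic over its output sink.
pub struct SketchLogger<Out: Write + 'static> {
    pub out: Out,
    /// Counts how many log calls actually reached the formatting step.
    pub formatted: usize,
}

impl<Out: Write + 'static> SketchLogger<Out> {
    pub fn new(out: Out) -> Self {
        Self { out, formatted: 0 }
    }

    pub fn log(&mut self, args: fmt::Arguments<'_>) {
        // Skip all formatting work when the sink would discard it anyway.
        if TypeId::of::<Out>() == TypeId::of::<DevNull>() {
            return;
        }
        self.formatted += 1;
        // Write errors are ignored in this sketch.
        let _ = self.out.write_fmt(args);
    }
}
```

With a `Vec<u8>` sink the message is formatted and stored as usual; with `DevNull` the call returns before any formatting happens.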
👋 I see @TheBlueMatt was un-assigned.
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff     main     #4509    +/-
Coverage        86.19%   86.18%   -0.02%
Files              160      160
Lines           107537   107537
Branches        107537   107537
Hits             92693    92679     -14
Misses           12220    12230     +10
Partials          2624     2628      +4
I can now see that the speedup on CI is somewhat smaller than on my local machine: about 2.5x in CI.
fuzz/src/chanmon_consistency.rs
Outdated
Ok(chain::ChannelMonitorUpdateStatus::InProgress) => {
    let persisted_monitor = mon.clone();
    LatestMonitorState {
        persisted_monitor_id: monitor_id,
        persisted_monitor,
        pending_monitors: vec![(monitor_id, mon)],
    }
},
Behavior change: In the old code, persisted_monitor was Vec::new() (empty) for the InProgress case, meaning a reload via use_old_mons % 3 == 0 would panic on deserialization of the empty vec. Now persisted_monitor is a valid clone of the same monitor that goes into pending_monitors.
This means all three reload paths (%3 == 0/1/2) produce identical behavior for a freshly-watched InProgress channel (the same monitor object), whereas before %3 == 0 was a distinct (crash) path. The crash was arguably a test harness bug (not a useful coverage path), so this seems reasonable, but the field comment ("The latest ChannelMonitor that we told LDK we persisted", line 247) is now inaccurate for the InProgress case — we haven't actually told LDK we persisted it.
Consider either updating the comment to reflect this, or wrapping the field in Option<ChannelMonitor> so the InProgress watch case can be represented as None.
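As a rough illustration of the `Option` suggestion, the sketch below uses simplified, hypothetical types: `Monitor` stands in for `ChannelMonitor`, `WatchStatus` for the relevant `ChannelMonitorUpdateStatus` variants, and `monitor_for_reload` is an invented helper showing one possible fallback. The `Completed` arm is an assumption about how the non-pending case might look; it is not taken from the diff.

```rust
/// Hypothetical stand-in for a ChannelMonitor.
#[derive(Clone, Debug, PartialEq)]
pub struct Monitor(pub u64);

/// Hypothetical stand-in for the two relevant update statuses.
pub enum WatchStatus {
    Completed,
    InProgress,
}

pub struct LatestMonitorState {
    pub persisted_monitor_id: u64,
    /// `None` until persistence has actually been reported as complete.
    pub persisted_monitor: Option<Monitor>,
    pub pending_monitors: Vec<(u64, Monitor)>,
}

pub fn on_watch(status: WatchStatus, monitor_id: u64, mon: Monitor) -> LatestMonitorState {
    match status {
        WatchStatus::Completed => LatestMonitorState {
            persisted_monitor_id: monitor_id,
            persisted_monitor: Some(mon),
            pending_monitors: Vec::new(),
        },
        // Nothing has been reported as persisted yet, so say so explicitly
        // instead of storing a placeholder value.
        WatchStatus::InProgress => LatestMonitorState {
            persisted_monitor_id: monitor_id,
            persisted_monitor: None,
            pending_monitors: vec![(monitor_id, mon)],
        },
    }
}

/// One possible reload policy (an assumption): prefer the persisted monitor
/// and fall back to the oldest pending one when none exists.
pub fn monitor_for_reload(state: &LatestMonitorState) -> Option<&Monitor> {
    state
        .persisted_monitor
        .as_ref()
        .or_else(|| state.pending_monitors.first().map(|(_, m)| m))
}
```

With this shape the field comment stays accurate, at the cost of every reload path having to handle the `None` case.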
let res = self.chain_monitor.update_channel(channel_id, update);
let mon = self.persister.take_latest_monitor(&channel_id);
Coverage reduction: The old update_channel independently deserialized the latest stored monitor, applied the update via update_monitor(), re-serialized, and stored the result. This verified on every update that:
- The stored monitor data round-trips correctly through serialization
- Updates can be successfully applied to a round-tripped monitor
The new code delegates entirely to chain_monitor.update_channel and captures the result from the persister, deferring round-trip verification to reload boundaries only (line 974–981).
If a particular update introduces a serialization issue that a subsequent update happens to mask before the next reload, the new code won't catch it. This is likely an acceptable trade-off for the 3-4x speedup (more iterations compensate for reduced per-iteration verification), but worth noting as a deliberate coverage change.
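To make the removed check concrete, here is a toy sketch of a per-update round trip of the kind the old code performed. `ToyMonitor`, its fixed 16-byte encoding, and `update_with_round_trip` are all invented for illustration and bear no relation to LDK's actual serialization format.

```rust
/// Hypothetical toy monitor with a fixed 16-byte encoding.
#[derive(Clone, Debug, PartialEq)]
pub struct ToyMonitor {
    pub update_id: u64,
    pub balance: u64,
}

impl ToyMonitor {
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(16);
        out.extend_from_slice(&self.update_id.to_be_bytes());
        out.extend_from_slice(&self.balance.to_be_bytes());
        out
    }

    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 16 {
            return None;
        }
        let mut id = [0u8; 8];
        let mut bal = [0u8; 8];
        id.copy_from_slice(&bytes[..8]);
        bal.copy_from_slice(&bytes[8..]);
        Some(Self { update_id: u64::from_be_bytes(id), balance: u64::from_be_bytes(bal) })
    }
}

/// Old-style update: deserialize the stored bytes, apply the update, and
/// re-serialize. A broken encoding is detected on the very next update
/// instead of at the next reload.
pub fn update_with_round_trip(stored: &mut Vec<u8>, new_balance: u64) -> Option<()> {
    let mut mon = ToyMonitor::from_bytes(stored)?;
    mon.update_id += 1;
    mon.balance = new_balance;
    *stored = mon.to_bytes();
    Some(())
}
```

Every update pays for one decode and one encode here, which is the per-iteration cost the PR trades away for a higher iteration rate.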
I indeed thought that just serialization is covered well enough elsewhere.
Review Summary
After a thorough second pass examining every hunk in the diff and the underlying implementations, I found no new issues beyond those already flagged in my prior review.
Verified as correct:
Previously, TestChainMonitor::update_channel would deserialize the monitor from stored bytes, apply the update, and serialize it back. This duplicated the work already done by the inner ChainMonitor, which applies the update to its in-memory monitor and calls the persister. Instead, have TestPersister capture the monitor directly when the real ChainMonitor calls persist. Serialization is deferred until reload_node actually needs the bytes, which happens rarely (only on specific fuzz input bytes that trigger a node restart). This eliminates redundant deserialization and serialization on every monitor update, replacing the expensive serialize-on-every-persist with a cheaper clone. AI tools were used in preparing this commit.
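The capture-then-serialize-lazily idea from this commit message might look something like the sketch below. All names are hypothetical: `SketchMonitor` and `CapturingPersister` are toy types, and the `take_latest_monitor` here only mirrors the name used in the diff, not its real signature or the real `Persist` trait.

```rust
use std::collections::HashMap;

/// Hypothetical toy monitor; the real ChannelMonitor is far larger.
#[derive(Clone, Debug, PartialEq)]
pub struct SketchMonitor {
    pub update_id: u64,
}

impl SketchMonitor {
    /// Stand-in for the comparatively expensive serialization step.
    pub fn serialize(&self) -> Vec<u8> {
        self.update_id.to_be_bytes().to_vec()
    }
}

/// Hypothetical persister that keeps a clone of the latest monitor per
/// channel instead of serializing on every persist call.
#[derive(Default)]
pub struct CapturingPersister {
    latest: HashMap<u32, SketchMonitor>,
}

impl CapturingPersister {
    /// Called on every monitor update: a clone, no serialization.
    pub fn persist(&mut self, channel_id: u32, monitor: &SketchMonitor) {
        self.latest.insert(channel_id, monitor.clone());
    }

    /// Hands the most recently captured monitor to the caller, if any.
    pub fn take_latest_monitor(&mut self, channel_id: &u32) -> Option<SketchMonitor> {
        self.latest.remove(channel_id)
    }
}

/// Serialization happens only when a reload actually needs the bytes.
pub fn bytes_for_reload(monitor: &SketchMonitor) -> Vec<u8> {
    monitor.serialize()
}
```

The trade-off is that serialization bugs surface only when `bytes_for_reload` is reached, which matches the coverage concern raised in the review thread above.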
04f3894 to 0621177
Reduce per-iteration overhead in the chanmon_consistency fuzz target. Together, these changes achieve a 3-4x speedup. This target has proven crucial for finding bugs, so maximizing its iteration rate is worthwhile.